Reducing Genome Assembly Complexity with Optical Maps: Mid-year Progress Report

Authors

  • Lee Mendelowitz
  • Mihai Pop
Abstract

The goal of genome assembly is to reconstruct contiguous portions of a genome (known as contigs) given short reads of DNA sequence obtained in a sequencing experiment. De Bruijn graphs are constructed by finding overlaps of length k − 1 between all substrings of length k from the reads, resulting in a graph where the correct reconstruction of the genome is given by one of the many possible Eulerian tours. The assembly problem is complicated by genomic repeats, which allow for exponentially many possible Eulerian tours, thereby increasing the de Bruijn graph complexity. Optical maps provide an ordered listing of restriction fragment sizes for a given enzyme across an entire chromosome, and therefore give long range information that can be useful in resolving genomic repeats. The algorithms presented here align contigs to an optical map and then use the constraints of these alignments to find paths through the assembly graph that resolve genomic repeats, thereby reducing the assembly graph complexity. The goal of this project is to implement the Contig-Optical Map Alignment Tool and the Assembly Graph Simplification Tool and to use these tools to simplify the idealized de Bruijn graphs for several bacterial genomes.
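The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmers`, `build_de_bruijn`, and `branching_nodes` are hypothetical, the reads are assumed to be error-free strings over A, C, G, T, and edge multiplicities (which a real Eulerian-tour formulation would need) are ignored. Nodes are the length-k substrings of the reads, and an edge joins two nodes that overlap by k − 1 characters.

```python
def kmers(read, k):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_de_bruijn(reads, k):
    """Map each k-mer to the k-mers it overlaps by k - 1 characters.

    Assumption for illustration: reads are error-free and use only
    the alphabet A, C, G, T. Edge multiplicities are not tracked.
    """
    nodes = set()
    for read in reads:
        nodes.update(kmers(read, k))

    graph = {}
    for u in sorted(nodes):
        # A successor shares u's last k - 1 characters as its prefix.
        graph[u] = [u[1:] + b for b in "ACGT" if u[1:] + b in nodes]
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor (candidate repeat exits)."""
    return sorted(u for u, succ in graph.items() if len(succ) > 1)


if __name__ == "__main__":
    g = build_de_bruijn(["ACGTACGA"], k=3)
    for node, succ in g.items():
        print(node, "->", succ)
    print("branching:", branching_nodes(g))
```

In this toy read the 3-mer ACG occurs twice with different continuations, so its node has two outgoing edges. Branch points of this kind are where several tours become possible and where long-range information, such as an optical map, could help choose the correct path.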


Similar resources

Reducing Genome Assembly Complexity with Optical Maps: Final Report

The goal of genome assembly is to reconstruct contiguous portions of a genome (known as contigs) given short reads of DNA sequence obtained in a sequencing experiment. De Bruijn graphs are constructed by finding overlaps of length k − 2 between all substrings of length k − 1 from reads of at least k bases, resulting in a graph where the correct reconstruction of the genome is given by one of th...


Reducing Genome Assembly Complexity with Optical Maps

De Bruijn graphs provide a framework for genome assembly, where the correct reconstruction of the genome is given by one of the many Eulerian tours through the graph. The assembly problem is complicated by genomic repeats, which allow for many possible Eulerian tours, thereby increasing the de Bruijn graph complexity. Optical maps provide an ordered listing of restriction fragment sizes for a g...


An algorithm for assembly of ordered restriction maps from single DNA molecules.

The restriction mapping of a massive number of individual DNA molecules by optical mapping enables assembly of physical maps spanning mammalian and plant genomes; however, not through computational means permitting completely de novo assembly. Existing algorithms are not practical for genomes larger than lower eukaryotes due to their high time and space complexity. In many ways, sequence assemb...


A physical map of the human genome

The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial art...


Whole Genome Optical Mapping

An innovative new technology, optical mapping, is used to infer the genome map of the location of short sequence patterns called restriction sites. The technology, developed by David Schwartz, allows the visualization of the maps of randomly located single molecules around a million base pairs in length. The genome map is constructed from overlapping these shorter maps. The mathematical and com...




Publication date: 2011